SAFE: a statistical algorithm for F0 estimation for both clean and noisy speech
نویسندگان
چکیده
A novel Statistical Approach for F0 Estimation, SAFE, is proposed to improve the accuracy of F0 tracking under both clean and additive noise conditions. Prominent Signal-to-Noise Ratio (SNR) peaks in speech spectra are robust information source from which F0 can be inferred. A probabilistic framework is proposed to model the effect of additive noise on voiced speech spectra. It is observed that prominent SNR peaks located in the low frequency band are important to F0 estimation, and prominent SNR peaks in the middle and high frequency bands are also useful supplemental information to F0 estimation under noisy conditions, especially babble noise condition. Experiments show that the SAFE algorithm has the lowest Gross Pitch Errors (GPE) compared to prevailing F0 trackers: Get F0, Praat, TEMPO, and YIN, in white and babble noise conditions at low SNRs.
منابع مشابه
On a robust F0 estimation of speech based on IRAPT using robust TV-CAR analysis
Fundamental frequency (F0) estimation is important in speech processing such as speech coding, synthesis, recognition and so on. A present F0 estimation method performs well under clean condition, however the performance deteriorates significantly in noisy environment. As a result, robust F0 estimation against additive noise is demanded. We have previously proposed F0 estimation methods based o...
متن کاملA fundamental frequency estimation method for noisy speech based on instantaneous amplitude and frequency
This paper proposes a robust and accurate F0 estimation method for noisy speech. This method uses two different principles: (1) an F0 estimation based on periodicity and harmonicity of instantaneous amplitude for a robust estimation in noisy environments, and (2) an F0 estimation based on stability of instantaneous frequency as an accurate estimation method. The proposed method also uses a comb...
متن کاملSpeech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions
Automatic recognition of speech emotional states in noisy conditions has become an important research topic in the emotional speech recognition area, in recent years. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...
متن کاملImproving YANGsaf F0 Estimator with Adaptive Kalman Filter
We present improvements to the refinement stage of YANGsaf[1] (Yet ANother Glottal source analysis framework), a recently published F0 estimation algorithm by Kawahara et al., for noisy/breathy speech signals. The baseline system, based on time-warping and weighted average of multi-band instantaneous frequency estimates, is still sensitive to additive noise when none of the harmonic provide rel...
متن کاملروشی جدید در بازشناسی مقاوم گفتار مبتنی بر دادگان مفقود با استفاده از شبکه عصبی دوسویه
Performance of speech recognition systems is greatly reduced when speech corrupted by noise. One common method for robust speech recognition systems is missing feature methods. In this way, the components in time - frequency representation of signal (Spectrogram) that present low signal to noise ratio (SNR), are tagged as missing and deleted then replaced by remained components and statistical ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2010